feat(power): measured-power multinode support (workers[] + per-stage joules) #405
…orm layer
The multinode runner emits per-role power scalars (prefill_avg_power_w, decode_avg_power_w, joules_per_input_token, joules_per_output_token_decode) and a per-worker breakdown array on disagg runs. Surface them on BenchmarkRow -> AggDataEntry -> InferenceData so chart features can consume them without further transform changes. Workers ride as a sibling of the scalar metrics dict to keep the existing Record<string, number> index intact.
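A minimal sketch of the "workers as a sibling of the metrics dict" idea from the commit message above. The names `RowSketch`, `EntrySketch`, `optionalMetric`, and `toEntry` are hypothetical and only illustrate the shape; the real types and transform are in the PR's diff. The helper also shows one way a legacy row without a key can stay `undefined` instead of collapsing to `0`.

```typescript
// Hypothetical shapes for illustration only; field lists follow the PR text.
interface WorkerPower {
  role: 'prefill' | 'decode';
  num_gpus: number;
  hosts: string[];
  avg_power_w: number;
}

interface RowSketch {
  metrics: Record<string, number>; // scalar metrics only, index type unchanged
  workers?: WorkerPower[];         // sibling of metrics, not nested inside it
}

interface EntrySketch {
  prefill_avg_power_w?: number;
  decode_avg_power_w?: number;
  joules_per_input_token?: number;
  workers?: WorkerPower[];
}

// Returns the metric when the key is present (including 0), undefined when absent.
function optionalMetric(metrics: Record<string, number>, key: string): number | undefined {
  return Object.prototype.hasOwnProperty.call(metrics, key) ? metrics[key] : undefined;
}

// Copies the scalars one by one and passes workers through untouched.
function toEntry(row: RowSketch): EntrySketch {
  return {
    prefill_avg_power_w: optionalMetric(row.metrics, 'prefill_avg_power_w'),
    decode_avg_power_w: optionalMetric(row.metrics, 'decode_avg_power_w'),
    joules_per_input_token: optionalMetric(row.metrics, 'joules_per_input_token'),
    workers: row.workers,
  };
}
```

Because `workers` never enters `metrics`, any code that iterates the dict as `Record<string, number>` keeps working without a type change.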
App-side support for the InferenceX runner's multinode measured-power telemetry: per-worker power breakdown (workers[] with role / num_gpus / hosts), per-stage joules (input = prefill energy, output = decode energy), and cluster-wide temp / util / mem.
- db: migration 006_benchmark_results_workers.sql; ETL mapper + ingest + queries + json-provider carry workers[] and the new metric scalars
- constants: metric-keys lists the measured-power fields; joules_per_output_token is per-stage decode on disagg (no separate _decode key)
- app: benchmark-transform + inference types surface the fields; unofficial-run route passes them through; cypress measured-power-overlay e2e
- drop joules_per_output_token_decode (folded into joules_per_output_token) to match the runner's per-stage change
Validated: app 60 + db 223 unit tests pass; tsc --noEmit clean.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude finished @arygupt's task in 2m 58s
Claude Code Review
🟢 LGTM — no blocking issues found. Reviewed the multinode measured-power plumbing end-to-end. The diff is clean and defensive at every JSON boundary:
Test coverage is solid: dedicated unit tests for Minor observation (non-blocking): the |
…serting visibility
The measured-power-overlay e2e asserted `.should('be.visible')` on the gated
"Measured Energy" group + its options right after opening the Y-axis dropdown.
That list is a scroll container (searchable-select: max-h-72 overflow-y-auto)
and the group sits below the fold, so the items are in the DOM but clipped by
the overflow ancestor. Cypress `be.visible` does not auto-scroll, so it failed
deterministically on both the chrome and firefox e2e shard-2 jobs. The sibling
test passed only because `.click()` auto-scrolls its target into view.
Add `.scrollIntoView()` before each visibility assertion.
Verified locally: measured-power-overlay.cy.ts 2/2 passing against the real
run 26312107787 artifact with E2E_FIXTURES=1.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
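The clipping described in the commit message above can be illustrated with a deliberately simplified model. This is a hypothetical sketch of overflow clipping, not Cypress's actual visibility algorithm; `ScrollContainer`, `Item`, `isClipped`, and `scrolledTo` are made-up names, and the 288px height assumes Tailwind's default `max-h-72` (18rem) at a 16px root font size. The 400px item offset is an arbitrary illustrative value.

```typescript
// Simplified, hypothetical model: an item is "clipped" when none of it falls
// inside the container's visible window.
interface ScrollContainer {
  heightPx: number;    // visible height of the overflow-y-auto container
  scrollTopPx: number; // current scroll offset
}

interface Item {
  offsetTopPx: number; // item's top edge measured from the start of the list content
  heightPx: number;
}

function isClipped(container: ScrollContainer, item: Item): boolean {
  const top = item.offsetTopPx - container.scrollTopPx;
  const bottom = top + item.heightPx;
  return bottom <= 0 || top >= container.heightPx;
}

// Models what scrolling the item into view does: align its top with the container's top.
function scrolledTo(container: ScrollContainer, item: Item): ScrollContainer {
  return { ...container, scrollTopPx: item.offsetTopPx };
}

const dropdown: ScrollContainer = { heightPx: 288, scrollTopPx: 0 };
const measuredEnergyGroup: Item = { offsetTopPx: 400, heightPx: 32 };

console.log(isClipped(dropdown, measuredEnergyGroup));
console.log(isClipped(scrolledTo(dropdown, measuredEnergyGroup), measuredEnergyGroup));
```

In this model the group is clipped while the list is unscrolled and no longer clipped once the container has scrolled to it, which is the ordering the added `.scrollIntoView()` call enforces before each `be.visible` assertion.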
Claude finished @arygupt's task in 50s
Claude Code Review (Re-review)
🟢 Re-review: LGTM — no blocking issues found. Only one new commit since the previous review (9847aef): a focused Cypress fix that adds Root-cause analysis in the commit message matches the code:
Previous review feedback on the multinode telemetry plumbing (extractWorkers parsing, JSONB null handling, materialized-view recreation, overlay path symmetry) all still applies and remains addressed. The non-blocking observation about consolidating the |
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 9847aef.
…age dup
The read-side BenchmarkWorkerRow interface in queries/benchmarks.ts was a verbatim redeclaration of the ingest-side WorkerPower in etl/benchmark-mapper.ts. Alias it to the canonical definition so the per-worker shape can't drift within the package, keeping the BenchmarkWorkerRow name its consumers reference.
Addresses the Cursor Bugbot "triplicate WorkerPower interface" finding on #405.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
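The aliasing described above might look roughly like the following. This is a single-file sketch under assumptions: in the real package the two declarations live in separate modules with an import between them, the `WorkerPower` field list is inferred from the PR description, and `totalGpus` is a hypothetical consumer added only to show that the read-side name keeps working.

```typescript
// Canonical ingest-side shape (etl/benchmark-mapper.ts in the PR).
// Field list is an assumption based on the PR description.
interface WorkerPower {
  role: 'prefill' | 'decode';
  num_gpus: number;
  hosts: string[];
  avg_power_w: number;
}

// Read-side name (queries/benchmarks.ts in the PR): an alias, not a redeclaration,
// so a field added to WorkerPower is picked up here automatically.
type BenchmarkWorkerRow = WorkerPower;

// Hypothetical consumer that keeps referencing the read-side name.
function totalGpus(workers: BenchmarkWorkerRow[]): number {
  return workers.reduce((sum, w) => sum + w.num_gpus, 0);
}
```

A type alias adds no runtime code, so the change only affects type checking.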
Claude finished @arygupt's task in 52s
Claude Code Review (Re-review)
🟢 Re-review: LGTM — no blocking issues found. One new commit since the previous review ( Verified:
All prior review findings on the multinode telemetry plumbing (extractWorkers parsing, JSONB null handling, materialized-view recreation, overlay path symmetry, test coverage) remain addressed and still apply.

What
App-side support for the InferenceX runner's multinode measured-power telemetry — the dashboard counterpart to SemiAnalysisAI/InferenceX#1574.
Surfaces, end-to-end (ingest → DB → transform → types):
- `workers[]`, each labeled `prefill`/`decode` with `num_gpus`, `hosts[]`, and per-worker `avg_power_w` (+ optional temp/util/mem).
- `joules_per_input_token` = prefill energy ÷ input; `joules_per_output_token` = decode energy ÷ output (per-stage, decode GPUs only). `joules_per_total_token` stays cluster-wide.
- `avg_temp_c`, `peak_temp_c`, `avg_util_pct`, `avg_mem_used_mb`.
Changes
- `006_benchmark_results_workers.sql`; ETL mapper + ingest + queries + json-provider carry `workers[]` and the new metric scalars.
- `metric-keys` lists the measured-power fields.
- `benchmark-transform` + inference `types` surface the fields; `unofficial-run` route passes them through; cypress `measured-power-overlay` e2e.
Schema note (pairs with the runner change)
`joules_per_output_token` is now the per-stage decode value on disagg (symmetric with `joules_per_input_token`), matching the reviewer's spec on #1574. The old `joules_per_output_token_decode` key is removed (folded into `joules_per_output_token`); it was only plumbed through, never rendered, and the J/output chart already reads `joules_per_output_token`.
Validation
- `tsc --noEmit` clean; lint (oxlint) 0 warnings/errors
Depends on
Runner PR SemiAnalysisAI/InferenceX#1574 (emits these agg-JSON fields). Land/coordinate together.
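For reference, the per-stage ratios in the description can be sketched as below. This is not the runner's code: `StageTotals`, `perToken`, and `joulesPerToken` are hypothetical names, the input is assumed to already hold energy totals in joules, and dividing cluster-wide energy by input + output tokens for `joules_per_total_token` is an assumption, since the description only says that field stays cluster-wide.

```typescript
// Hypothetical input shape; these field names are illustrative, not the runner's schema.
interface StageTotals {
  prefillEnergyJ: number; // energy drawn by prefill GPUs
  decodeEnergyJ: number;  // energy drawn by decode GPUs
  clusterEnergyJ: number; // energy drawn by the whole cluster
  inputTokens: number;
  outputTokens: number;
}

interface JoulesPerToken {
  joules_per_input_token?: number;
  joules_per_output_token?: number;
  joules_per_total_token?: number;
}

// Leaves the metric undefined rather than emitting Infinity/NaN when there are no tokens.
function perToken(energyJ: number, tokens: number): number | undefined {
  return Number.isFinite(energyJ) && tokens > 0 ? energyJ / tokens : undefined;
}

function joulesPerToken(t: StageTotals): JoulesPerToken {
  return {
    joules_per_input_token: perToken(t.prefillEnergyJ, t.inputTokens),
    joules_per_output_token: perToken(t.decodeEnergyJ, t.outputTokens),
    // Assumption: total tokens = input + output.
    joules_per_total_token: perToken(t.clusterEnergyJ, t.inputTokens + t.outputTokens),
  };
}
```

With made-up totals of 1200 J prefill over 600 input tokens and 3000 J decode over 1000 output tokens, this gives 2 J per input token and 3 J per output token; a cluster total of 4800 J over the combined 1600 tokens gives 3 J per total token.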
🤖 Generated with Claude Code
Note
Medium Risk
Adds a schema migration and materialized-view rebuild plus broad ETL/API contract changes; behavior is well-tested but must stay aligned with the companion runner PR and production migration order.
Overview
Adds end-to-end support for multinode / disaggregated measured-power telemetry from the runner: per-worker breakdowns, disagg role-split scalars, and cluster-wide temp/util/memory metrics.
Database & ingest: Migration `006` adds a dedicated `benchmark_results.workers` JSONB column (kept out of flat `metrics`), recreates `latest_benchmarks`, and extends bulk ingest upserts. ETL gains `extractWorkers()` with validation, excludes `workers` from numeric metrics, and registers new scalar keys in `METRIC_KEYS`.
Read path: Benchmark queries, json-provider dumps, API `BenchmarkRow`, unofficial-run normalization, and `benchmark-transform` all pass through `workers[]` plus scalars like `prefill_avg_power_w`, `decode_avg_power_w`, `joules_per_input_token`, and `avg_temp_c`, preserving undefined vs zero for legacy rows.
App types & tests: `WorkerPower` / extended `AggDataEntry` types document disagg semantics (`joules_per_output_token` as per-stage decode on disagg). Unit tests cover mapper, transform, and unofficial-run routes; Cypress verifies measured-energy Y-axis options on an unofficial-run overlay.
Reviewed by Cursor Bugbot for commit db9b15d. Bugbot is set up for automated code reviews on this repo.
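As a rough illustration of the ingest-side behavior summarized above (validate `workers`, keep it out of the numeric metrics), here is a sketch under assumptions. The function names carry a `Sketch` suffix because they are not the PR's `extractWorkers()`; the validation rules (known role, positive integer GPU count, string hosts, finite power) and the choice to drop invalid entries are guesses at what "with validation" could mean.

```typescript
interface WorkerPower {
  role: 'prefill' | 'decode';
  num_gpus: number;
  hosts: string[];
  avg_power_w: number;
}

// Type guard with assumed validation rules; the real rules are in the PR's diff.
function isWorker(value: unknown): value is WorkerPower {
  if (typeof value !== 'object' || value === null) return false;
  const w = value as Record<string, unknown>;
  const role = w['role'];
  const numGpus = w['num_gpus'];
  const hosts = w['hosts'];
  const power = w['avg_power_w'];
  return (
    (role === 'prefill' || role === 'decode') &&
    typeof numGpus === 'number' && Number.isInteger(numGpus) && numGpus > 0 &&
    Array.isArray(hosts) && hosts.every((h) => typeof h === 'string') &&
    typeof power === 'number' && Number.isFinite(power)
  );
}

// Returns the valid workers, or undefined when the key is absent or nothing validates.
function extractWorkersSketch(raw: Record<string, unknown>): WorkerPower[] | undefined {
  const candidate = raw['workers'];
  if (!Array.isArray(candidate)) return undefined;
  const valid = candidate.filter(isWorker);
  return valid.length > 0 ? valid : undefined;
}

// Collects finite numeric scalars and skips the workers key entirely.
function numericMetricsSketch(raw: Record<string, unknown>): Record<string, number> {
  const out: Record<string, number> = {};
  for (const [key, value] of Object.entries(raw)) {
    if (key === 'workers') continue;
    if (typeof value === 'number' && Number.isFinite(value)) out[key] = value;
  }
  return out;
}
```

Keeping the two extractions separate means a malformed `workers` payload cannot leak into the flat metrics, and a row with no workers simply yields `undefined`.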